Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: API for high-precision kernel alarms. #16398

Closed
wants to merge 4 commits into from

Conversation

pabigot
Copy link
Collaborator

@pabigot pabigot commented May 26, 2019

This PR defines and implements k_alarm, a light-weight kernel timer that uses cycle-clock precision alarms. A reference implementation on Nordic nRF is provided.

The solution also implements the system tick clock using a k_alarm, with performance similar to the existing approach.

#12068 being worked by Nordic proposes a generic system clock solution based on counters, but I've come to doubt that the counter API can satisfy the performance requirements for a system clock. The approach here is essentially the same as the existing practice, using a dedicated hardware counter but at its full resolution.

The approach here could also be the basis of a solution to #6498 on high-resolution kernel timers. struct _timeout might be replaced by struct k_alarm making this feature available to k_poll, k_busywait, and anything else where precise delays would be useful.

For what it's worth, on Nordic hardware (with a 32 KiHz system clock) I've run applications that have repeating lossless callbacks at 16 KiHz and even 32 KiHz, on both nRF51 and nRF52. I've been using it in my own work for over a month and am very happy with it.

There are gaps that would need to be addressed:

  • This has been tested only on Nordic hardware. In principle I believe the solution should be extensible to anything that offers a compare interrupt on a counter with a fixed rate time-based increment.
  • The PR doesn't include the test application.
  • There is a restriction on the duration of a basic alarm to the half-span of the 32-bit cycle counter (some eighteen hours for Nordic).
  • The coordination between the soc/kernel APIs may need rework.
  • No attempt has been made to address concerns related to userspace.

I don't see any reason in principle why these can't be overcome. If the solution to #12068 works out then great; if not, this might be an alternative approach.

@zephyrbot
Copy link
Collaborator

zephyrbot commented May 26, 2019

All checks are passing now.

Review history of this comment for details about previous failed status.
Note that some checks might have not completed yet.

Add a constant-time operation that splits a list into a portion before a
node and the portion starting at the node, moving the portion before the
node to the end of a different list and leaving the original list
starting at the node.

Add a constant-time list join operation to move an existing list to the
end of a different list.

This supports use cases where a dlist is split at some point
(e.g. timers that have reached their deadline), so that the extracted
elements can be processed without affecting content added to the
dlist during processing.

Upstream-Status: Pending [from upstream 12485, no upstream need]

Signed-off-by: Peter A. Bigot <[email protected]>
k_alarm is analogous to k_timer but with more flexibility and precision.

Alarms are set using an absolute deadline measured in ticks of the
system clock.  The system clock must be maintained as a 64-bit unsigned
counter that will never overflow.  Most operations involve a deadline
specified as a 32-bit unsigned value.  The deadline is compared against
the low 32 bits of a 64-bit system clock.  As long as the system clock
rate does not exceed 64 MHz the 32-bit counter not wrap for at least one
minute; the 64-bit counter permits 10 years at 54 GHz.

There is no alignment of deadlines to Zephyr's notion of a clock tick
(generally 10 ms, minimum 1 ms).  Consequently deadlines are expressed
and triggered with the full accuracy of the system clock, ignoring
delays from higher-priority interrupts.  Further, alarms do not fire
early due to cycles lost as a result of fractional or truncated clock
ticks.

Deadlines that are at or before the current (32-bit) clock value are
considered to be in the past.  For a 32 KiHz system clock this allows
deadlines up to 18:12:15.999969 H:M:S in the future.  (Alarms with
64-bit deadlines could be supported by an additional data structure.)

Unlike k_timer the k_alarm data structure does not support automatic
rescheduling at a fixed interface.  However, the alarm-specific callback
is permitted to reschedule the alarm.  The original deadline is
available in the k_alarm structure for fixed interval reschedule that
ignores delays in callback invocation.  Alternatively, the live clock
can be read and the alarm scheduled at a fixed duration after the time
of last callback.

Alarms that are scheduled with a delay that has already passed become
ready immediately and are processed in order of scheduling within
deadline.  Late alarms scheduled within an alarm callback will be
processed without delay.

Alarms that are ready or scheduled can be cancelled; whether the
cancellation took affect before the callback was invoked is indicated by
the cancellation return value.

Flags allow detection of late-to-set, atomic rescheduling of alarms that
have not fired, and conditional rescheduling of alarms to fire earlier.

Upstream-Status: Pending [probably not acceptable]

Signed-off-by: Peter A. Bigot <[email protected]>
Provides an alternative timer infrastructure that primarily provides
k_alarm support, and supports the legacy tick-based timeout API through
a k_alarm.

Upstream-Status: Inappropriate [conflicts with upstream project]

Signed-off-by: Peter A. Bigot <[email protected]>
The syscall infrastructure generates wrappers for the k_alarm API even
on platforms where it's not enabled, which causes build errors.  Declare
the required data structure when the feature is not enabled.

Upstream-Status: Inappropriate [alarm must support all targets]
Signed-off-by: Peter A. Bigot <[email protected]>
@@ -24,6 +24,10 @@ extern void k_cpu_idle(void);
extern u32_t z_timer_cycle_get_32(void);
#define z_arch_k_cycle_get_32() z_timer_cycle_get_32()

#if CONFIG_ALARM
extern u64_t z_timer_cycle_get_64(void);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we introduce k_cycle_get_64() instead of this internal function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

@nordic-krch
Copy link
Contributor

@pabigot thanks for submitting this. I started to review it but may not finish today.

@pabigot pabigot marked this pull request as ready for review May 27, 2019 11:55
@pabigot pabigot added the DNM This PR should not be merged (Do Not Merge) label May 27, 2019
@pabigot
Copy link
Collaborator Author

pabigot commented May 27, 2019

I pressed the "Ready for Review" button Github provided so people could actually use the review infrastructure, which turned this into a mergeable PR. I've marked it DNM as there's a lot of add-on work needed if this is actually going to merge.

Also it's very important to note: The primary driver for this work was high-precision alarms. That the system clock can be implemented using them is a nice side-effect.

Copy link
Contributor

@nordic-krch nordic-krch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pabigot some of my comments. I will continue tomorrow.

* the current cycle counter value.
*/
static ALWAYS_INLINE u32_t cycles_delta_di(void)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it be the same if it just returns (RTC->COUNTER - (u32_t)last_cycles) & COUNTER_MASK)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

/* The low 32-bits of the cycle counter. */
u32_t z_timer_cycle_get_32(void)
{
k_spinlock_key_t key = k_spin_lock(&lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as this function would be used for timestamping in logging, i would like to see it optimized, without spin lock. My idea was following:

  • we need to take into account that rtc wraps and we may just hit the case when it wrapped but rtc interrupt wasn't yet handled.
  • once in the counter span there is a synchronization moment when last_reference32 is stored. It means that when z_timer_cycle_get_32() is called it will return value which is always bigger than ref_timestamp32 unless: RTC wrap just occurred.
  • upperstamp32 - upper 8 bits of the timestamp, incremented (by 2^24) on each wrap. when upperstamp32 wraps ref_timestamp32 is also updated.
    With those in mind getting 32 bit timestamp can be following:
u32_t ret = RTC->COUNTER + `upperstamp32`;
if (ret < last_reference32) {
    // that's the special, very rare case wrapping of 24 bit or wrapping which is not yet handled by interrupt
   ret += COUNTER_SPAN;
}
return ret;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. My testing didn't show a huge impact from using the spinlock. I'd have to think hard about your algorithm to be sure it actually works: there is no last_reference32 and what it means affects correctness. The lock and re-using sysclock_get_32() is much easier to understand.

/* The full cycle counter. This is not standard API; it should be. */
u64_t z_timer_cycle_get_64(void)
{
k_spinlock_key_t key = k_spin_lock(&lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this could also be optimized. we read the timestamp, if it's smaller than last one it means that wrapping is not yet handled (assuming that reference timestamping happens at least once during each subcycle)

void z_clock_set_timeout(s32_t ticks, bool idle)
{
ARG_UNUSED(idle);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as i'm not the fan of ifdefs i would prefer to see if (!IS_ENABLED(CONFIG_TICKLESS_KERNEL)) return;.

Copy link
Collaborator Author

@pabigot pabigot May 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh. It's a really large sequence of code to stick into a conditional statement. If this solution has legs I'll consider it along with other perspectives.

Yeah, that works, though I'm not a fan of multiple exits.

But this is queued for next revision, along with replacing #ifdef in z_clock_elapsed() where it makes more sense.

int rv = -EINVAL;
bool update_deadline = false;

if (!alarm) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would rather assert here. Do we expect that to be null?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but user code gets here and if it isn't checked we'd get a hard fault. I'm not a fan of assert since it fails at runtime and Zephyr doesn't have a robust infrastructure for handling failure safely. This way nothing gets corrupted and the application has a chance to detect what went wrong.

rv = -EINVAL;
} else {
alarm->state = K_ALARM_READY;
update_deadline = link_alarm(&ready_list, alarm, now);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if alarm is ready (configured too late) alarm_update_deadline_ will end up setting CC[0] to 4 ticks from now and ready alarm will expire up to 128us from now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? No, the ready list isn't what's used to calculate the deadline. Once it's on the ready list it will be processed at the next opportunity, without affecting the compare register.

Copy link
Contributor

@nordic-krch nordic-krch May 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i referred to the following scenario:
z_impl_k_alarm_schedule is called late. alarm is linked to ready_list and state is set to READY. update_deadline is set to true and z_alarm_update_deadline is called. in alarm_update_deadline_ rc=0 so compare==now but RTC->EVENTS_COMPARE[0] is not set. We end up with compare = now + COUNTER_MIN_DELTA being set in CC register. RTC interrupt will happen 4 ticks from now and alarm will be triggered from there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I understand the case now. Thanks for explaining it.

Yes, the alarm will fire late. This is no different from a sequence where the alarm was scheduled in the future and has reached its deadline by the time the counter is set; or when the delay occurred because something blocked the timer ISR for several ticks.

These are edge cases, and I believe they're handled in a reasonable way with reasonable complexity.

I would love it if there were some way to force a compare event to occur immediately, but AFAICT the ARM/Nordic peripherals don't allow that, so the only solution is to set the compare event to the shortest value we can be sure will fire without lapping. (MSP430 does allow triggering interrupts manually, which made things like this really easy to deal with when I was working in that world.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if instead of calling alarm_update_deadline_ in case of readiness interrupt would be triggered (NVIC_SetPending). Or that can be part of alarm_update_deadline_

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could trigger the ISR, but is there a way to force EVENTS_COMPARE[0] to be set?

This approach would work if we only used the ISR to process CC[0]. I would rather not limit it that way, as I can see use cases for low-latency dedicated callbacks triggered by system time that would use the other channels and bypass the alarm infrastructure, as a Nordic-specific extension (e.g. this might eliminate the need for the Bluetooth stack to maintain its own RTC).

But it is a potential solution if there's a strong consensus that the delay in this case is unacceptable. I personally don't consider it a serious problem that's worth excluding low-latency time-triggered callbacks.

compare = now + COUNTER_MIN_DELTA;
}
RTC->EVENTS_COMPARE[0] = 0;
RTC->CC[0] = compare;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

algorithm does not take into account that previous CC[0] value may be just now and it may expire between event being cleared and setting new CC[0] value. You could look at the algorithm in #15510 (https://github.com/zephyrproject-rtos/zephyr/blob/027cd2b9cb2d1091b93d9b9d1356bebda3f2b061/drivers/counter/counter_nrfx_rtc.c#L168) which also takes care of late setting (adding delta would not be needed).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case the problem would only arise if COUNTER incremented twice after now was read, with one of those times being after the check that COMPARE was clear. In that case we'd be at most 2 ticks late. The alternative approach you cite is extremely complicated and I strongly expect the logic required to implement it would have a worse impact than an extremely rare delay in an alarm that's already subject to latency from arbitrary interrupts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand current approach has following consequences:

  • enforcing high priority of RTC interrupt because access to RTC registers must not be interrupted (COUNTER_MIN_DELTA)
  • User callbacks are also executed in RTC interrupt context. That is risky and may impact system performance. Especially, if application requires other RT operations in high priority interrupts. For unaware users that may have even more consequences (e.g. using printk from timeout callback leading to bluetooth disconnections).
  • delaying expiration of late alarms by 4 cycles.

RTC interrupt should rather have low/lowest priority with algorithm that handles late setting.

The alternative approach you cite is extremely complicated

well, it's like 15-20 lines of code, if that would allow lowering interrupt priority i would go for that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RTC registers are protected by a spinlock (disabling interrupts); the priority of the RTC interrupt is not changed by this implementation. Nor is the ability for applications to provide callbacks that are executed from within the ISR. The delay exists as an edge case and is due to limitations in the Nordic hardware; by using a dedicated peripheral rather than channel on peripheral, could be eliminated by the technique you suggested.

If something like k_alarm is to be a replacement for the existing timer infrastructure your concerns should be considered in the context of defining scope and functional requirements that can be used as the basis of target-specific implementation.

* alarm;
* * CANCELLED on k_alarm_cancel().
*/
K_ALARM_READY,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the need for having READY state. It seems to me like this comes from algorithm you've chosen which first determines list of ready timers and then processes it. If algorithm would use only one list and pick head, determine if it expired or not and call user handler there would be no READY state. Only active. READY state is used to determine during cancellation if alarm already expired but it could be determined (more precisely) by comparing alarm value vs now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having two queues is critical to correctly handle cases where a callback reschedules itself or another alarm, as @pizi-nordic should be able to confirm based on past arguments about the way k_timer works.

rv = 1;
sys_dlist_remove(&alarm->node);
} else if (alarm->state == K_ALARM_READY) {
sys_dlist_remove(&alarm->node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to API doc:

if the alarm had reached its deadline but the callback had not been invoked at the point where the cancellation was requested.

but here now vs value is not checked. That would be better "lateness" indicator than state.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't have to check. If it's READY it's on the ready queue and hasn't been processed, so it can simply be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the case where we see that alarm is READY happens when we interrupted RTC interrupt so we are in higher priority interrupt. We may as well interrupted RTC interrupt before ready state is determined and alarm also may be already late and that case is not covered and from user perspective they are the same.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alarm is also observably READY when multiple alarms reach their deadline in the same timer ISR, and one callback tries to cancel another alarm while it's still on the queue.

I think this can be resolved by clarifying that "reached its deadline" does not just mean that the RTC counter has advanced to the deadline (which we can't easily observe), but that the alarm and timer infrastructure has recognized that event has happened (which we can observe, because it's reflected in the alarm state being READY).

I will reword the documentation to make that clear.

This relates to your comment about the return value from k_alarm_cancel(). As a user what I want to know is what the state of the alarm was (distinguish SCHEDULED from READY), not a difference in ticks that was captured at some arbitrary point in the past. Any such difference is irrelevant since we don't know how long ago it was calculated.

* @retval -EINVAL if the alarm was in any other state than SCHEDULED
* or READY when the cancellation was requested.
*/
__syscall int k_alarm_cancel(struct k_alarm *alarm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see a benefit of having distinction between return value 0 and positive. We managed to cancel alarm which was in the state that would result in expire callback or we canceled alarm which would not expire because it was not active (not started or already expired). We can check if we reach alarm by reading k_cycle_get32() and comparing it against alarm->deadline. Maybe we need k_alarm_get_deadline() API?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't see a benefit of having distinction between return value 0 and positive.

And that's why counter cancellation doesn't provide this information and #12624/#12625 stay open. Please understand that I do find knowing the state of the alarm to be useful, and it's not helpful to make me (as an application developer) compare values (one of which changes asynchronously) and try to reconstruct it when I could just have been given the information.

I've never needed k_alarm_get_deadline(). Because this API doesn't use relative alarms it's always the value I last set it to, and I can read that from the k_alarm data structure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree that knowing the state of the alarm may be important but it's just that this information is not reliable because by the time i get the information deadline may already be met (but was not when alarm was canceled). maybe k_alarm_cancel should return:

  • -EINVAL if alarm is not SCHEDULED or ACTIVE
  • number of ticks before deadline when alarm was canceled

That would be more or less the same but positive value would have additional information.
Note that returning 0 may only happen in higher priority interrupt. Number of ticks left would allow conditions like if ((u32_t)ret < 2) alarm_reached();

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #16398 (comment).

Can you provide a use case where having the time remaining at the point of cancellation provides more value than simply knowing the state of the alarm when it was cancelled, given that you don't know when the cancellation occurred?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will ask the similar question 😄 What is the use case for knowing that alarm infrastructure marked alarm as ready. If cancellation returns positive value, how can you be sure that it did not expire by the time you process that information (you may be interrupted for multiple ticks). If it is vital to do something special when cancelled alarm more or less reached deadline (but was cancelled before callback) you can lock, get now compare it with deadline and if it is close (user decides how close) you do something.

RTC->CC[0] = compare;
}

void z_alarm_update_deadline(void)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why there is alarm_update_deadline_() if it is used only here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was used elsewhere in earlier versions. I also think it's clearer as a separate function than in the body of a conditional, but I do think it should be made ALWAYS_INLINE since there's a single call site and a stupid compiler might fail to do the right thing.

* @retval positive if there is a scheduled alarm. *deadlinep will
* contain the corresponding deadline, which may be in the past.
*/
int k_alarm_next_deadline_(u32_t *deadlinep);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the _ at the end of the name. (This is not used in any other kernel function afaik)

*
* @retval postive the number of alarms added to the ready list.
*/
int k_alarm_split_(u32_t now);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

*
* @note This is not user API.
*/
void k_alarm_process_ready_(void);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@@ -125,6 +126,9 @@ struct k_stack;
struct k_mem_slab;
struct k_mem_pool;
struct k_timer;
#if CONFIG_ALARM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for this ifdef.

See CONFIG_POLL for instance

if (!alarm_due_by(cp, alarm->deadline)) {
break;
}
cp = sys_dlist_peek_next_no_check(list, cp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aerate please, no need to coalesce } and next lines (unless it's another '}')

Apply that everywhere.

goto unlock;
}

sys_dnode_t *hp = sys_dlist_peek_head(&scheduled_list);
Copy link
Collaborator

@tbursztyka tbursztyka May 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never declare a variable in the middle of a code block. (apply that everywhere)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate where this "never" comes from? This uninitialised variable safety issue was fixed in C99, twenty years ago.

&& (flags & (K_ALARM_FLAG_REPLACE | K_ALARM_FLAG_IF_SOONER));

if (sys_dnode_is_linked(&alarm->node)
&& !may_resched) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put the && (or || etc...) at the end of the last line. See other kernel code (or everywhere else actually). Apply that everywhere

Copy link
Collaborator

@tbursztyka tbursztyka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style issues

@pabigot
Copy link
Collaborator Author

pabigot commented May 28, 2019

The review comments have revealed a latent but important design decision underlying the proposed API: the state of an alarm is defined at the points where the alarm infrastructure observes it.

External observations are irrelevant to the API even if there are cases where an application wants to make them. This is much like the existing timer in tickless, where the clock doesn't advance until z_clock_announce increments curr_tick, even in cases where additional ticks are pending that will be added in the same invocation of z_clock_announce.

Some of the review comments related to details of handling compare register updates and interpretation of alarm status (as opposed to state) are answered by taking this decision into account.

Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... just to get the argument out there. This is duplicating virtually everything the existing timer subsystem does, with a ton of code that needs to be added in addition to what's there. That's.. sorta borderline ridiculous in a size constrained RTOS.

It seems like the killer features here are that it's eliminated "ticks" as an API point and that it uses (and requires) a 64 bit cycle counter to avoid the resulting precision issues.

If that's the case, just do it and make this the new timer subsystem. I promise I won't stand in the way.

You'll probably need a fallback layer for systems with very fast clocks and/or limited ability to tolerate a 64 bit timer (remember 64 bit math has atomicity problems and that some systems (ARC, I'm looking at you) can't do proper atomics in userspace without system calls). And it will have to drop in as the backend for k_timer, thread timeouts and timeslicing.

And obviously we need drivers for the other platforms if this is going to be a core API.

But having every nRF device carry around two feuding timer subsystems because of a Conway's law kind of situation is silly, IMHO.

@nordic-krch
Copy link
Contributor

Summarizing my review: I like the switch from delta value to absolute values. We did the same with the system timer (called app_timer) in nRF5 SDK. For years we had app_timer with delta timeouts but that had a lot of issues. Then we switched to absolute values and that simplified the implementation (removal of accumulative error in periodic timers was another plus). There, to avoid locking for too long i've had list of requests and list of scheduled. When timer was started, start request was enqueued in the request list (unsorted) and RTC interrupt was triggered (NVIC_SetPending). In RTC interrupt context requests where processed and moved to scheduled list. With this approach, processing of scheduled timers happened only in one context so no locking there was needed (cancellation needs to be considered there).

My main worry regarding nrf RTC driver is high interrupt level requirement. More here: #16398 (comment)

If RTC interrupt can be on lower interrupt level then significance of frequent locking diminishes.

Regarding k_alarm API, it looks ok, minor issues regarding k_alarm_cancel return value.

@pabigot
Copy link
Collaborator Author

pabigot commented Jun 21, 2019

Closing this as no longer relevant. The solution in #16782 provides high-resolution timing, and the alarm enhancements I want can be built on top of that in a more general way than this architecture allows.

@pabigot pabigot closed this Jun 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Kernel area: Timer Timer DNM This PR should not be merged (Do Not Merge)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants